11 research outputs found

    Methodologies and Toolflows for the Predictable Design of Reliable and Low-Power NoCs

    There is today the unmistakable need to evolve design methodologies and tool flows for Network-on-Chip based embedded systems. In particular, the quest for low-power requirements is nowadays a more-than-ever urgent dilemma. Modern circuits feature billions of transistors, and neither power management techniques nor battery capacity are able to endure the increasingly higher integration capability of digital devices. Besides, power concerns come together with modern nanoscale silicon technology design issues. On one hand, system failure rates are expected to increase exponentially at every technology node when integrated circuit wear-out failure mechanisms are not compensated for. However, error detection and/or correction mechanisms have a non-negligible impact on the network power. On the other hand, to meet the stringent time-to-market deadlines, the design cycle of such a distributed and heterogeneous architecture must not be prolonged by unnecessary design iterations. Overall, there is a clear need to better discriminate reliability strategies and interconnect topology solutions upfront, by ranking designs based on power metrics. In this thesis, we tackle this challenge by proposing power-aware design technologies. Finally, we take into account the most aggressive and disruptive methodology for embedded systems with ultra-low power constraints, by migrating NoC basic building blocks to an asynchronous (or clockless) design style. We deal with this challenge by delivering a standard cell design methodology and mainstream CAD tool flows, in this way partially relaxing the requirement of using asynchronous blocks only as hard macros

    A transition-signaling bundled data NoC switch architecture for cost-effective GALS multicore systems

    Asynchronous networks-on-chip (NoCs) are an appealing solution to tackle the synchronization challenge in modern multicore systems through the implementation of a GALS paradigm. However, they have found only limited applicability so far for two main reasons: the lack of proper design tool flows and their significant area footprint compared with their synchronous counterparts. This paper proposes a largely unexplored design point for asynchronous NoCs, relying on transition-signaling bundled data, which helps break the above barriers. Compared to an existing lightweight synchronous switch architecture, xpipesLite, the post-layout asynchronous switch achieved a 71% reduction in area, up to 85% reduction in overall power consumption, and a 44% average reduction in energy-per-flit, while mastering the more stringent timing assumptions of this solution with a semi-automated synthesis flow

    Power efficiency of switch architecture extensions for fault tolerant NoC design

    The increasingly parallel landscape of embedded computing platforms is bringing the reliability concern for the on-chip interconnection network (NoC) to the forefront. While very few works in the open literature bring their error recovery mechanisms down to microarchitectural and physical implementation, this paper documents the effort of optimizing a baseline NoC switch architecture for different fault-tolerant strategies against single-event upsets. As key contributions, we not only propose a new efficient fault-tolerant flow control protocol, but also contrast correction-oriented vs. retransmission-oriented switch microarchitectures, each implementing both data and control path protection, with physical implementation awareness. The accuracy of the analysis methodology enables us to report counterintuitive power-reliability trade-offs between the design points, serving as guidelines for implementing fault-tolerant communication in a power-constrained environment

    Non-intrusive trace & debug NoC architecture with accurate timestamping for GALS SoCs

    This work proposes a flexible and modular solution for nonintrusive tracing and debugging of software on embedded SoC platforms. It utilizes a separate, dedicated Network-on-Chip (NoC) interconnect with a hierarchical unidirectional ring topology to connect a multitude of monitoring devices. The devices are controlled via a debugger attached to the NoC. They use the network to receive control information and send back observations, which the debugger uses to construct a trace. The system utilizes a very accurate and efficient differential timestamping approach. It allows working with multi-synchronous SoCs, identifying concurrencies and other temporal properties in the SoC and coping with partial power downs and clock gatings. The proposed solution requires a low amount of hardware resources and at the same time provides unmatched capabilities

    Crossbar replication vs. sharing for virtual channel flow control in asynchronous NoCs: A comparative study

    In on-chip interconnection networks, performance optimizations can often be achieved in two opposite ways: by making control logic more complex inside switches, or by pushing design complexity to the switch boundaries. The implementation of virtual channel (VC) flow control is an important application domain of this design trade-off. The data path of VC switches typically exhibits replicated buffers. The underlying philosophy (i.e., resource replication) can be pushed to the limit, thus incurring an apparently high area cost, while simplifying the switch control path. On the other hand, unreplicated resources require complex control logic for the sake of their efficient sharing among virtual networks. Investigating this design trade-off is especially important for asynchronous networks, where the synthesis of complex control circuits is a challenge. This paper is a first step toward a design space exploration of VC implementation techniques for transition-signalling bundled-data asynchronous NoCs, and contrasts a VC switch with replicated crossbars against a unified-crossbar architecture relying on multistage switch allocation

    A vertically integrated and interoperable multi-vendor synthesis flow for predictable noc design in nanoscale technologies

    We deliver a design flow for the synthesis and convergence of application-specific networks-on-chip. The flow comes with novel features that can better address nanoscale design challenges: front-end driven floorplanning, dynamic IR-drop minimization, fast and accurate system-level power grid modeling, predictable link design. Above all, such features are addressed by different prototype engines, even from different vendors, that can be smoothly integrated into the flow by means of a common specification format called Communication Exchange Format (CEF), that enables unprecedented tool interactions. This flow is validated by means of an extensive demonstration framework

    Assessing the energy break-even point between an optical NoC architecture and an aggressive electronic baseline

    Many crossbenchmarking results reported in the open literature raise optimistic expectations on the use of optical networks-on-chip (ONoCs) for high-performance and low-power on-chip communication. However, most of those previous works ultimately fail to make a compelling case for chip-level nanophotonic NoCs, mainly because of the lack of aggressive electronic baselines (ENoCs) and the poor accuracy in physical- and architecture-layer analysis of the ONoC. This paper aims at providing the guidelines and minimum requirements so that the emerging nanophotonic technology may become of practical relevance. The key differentiating factor of this work is that it contrasts ONoC solutions with an aggressive ENoC architecture with realistic complexity, performance, and power figures, synthesized on an industrial 40nm low-power technology. At the same time, key physical design issues and network interface architecture requirements for the ONoC under test are carefully assessed, thus paving the way for a well-grounded definition of the requirements for the emerging ONoC technology to achieve the energy break-even point with respect to pure electronic interconnect solutions in future multi- and many-core systems

    Towards compelling cases for the viability of silicon-nanophotonic technology in future manycore systems

    Many crossbenchmarking results reported in the open literature provide optimistic expectations on the use of optical networks-on-chip (ONoCs) for high-performance and low-power on-chip communication in future manycore systems. The goal of this paper is to highlight key methodological steps for a realistic assessment of the emerging nanophotonic technology. Building on this methodology, the paper provides an accurate energy efficiency comparison between an ONoC and an ENoC counterpart both at the level of the system interconnect and of the system as a whole. As a result, the paper points out the most promising directions for the development of the technology for the sake of practical relevance, and confirms that the technology has potential based on a characterization methodology with uncommon cross-layer visibility

    A complete self-testing and self-configuring NoC infrastructure for cost-effective MPSoCs

    © ACM, 2013. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Embedded Computing Systems, Vol. 12, No. 4, Article 106, Publication date: June 2013. http://doi.acm.org/10.1145/2485984.2485994

    Networks-on-chip need to survive manufacturing faults in order to sustain yield. An effective testing and configuration strategy, however, implies two opposite requirements. On one hand, a fast and scalable built-in self-testing and self-diagnosis procedure has to be carried out concurrently at NoC switches. On the other hand, programming the NoC routing mechanism to go around faulty links and switches can be optimally performed by a centralized controller with global network visibility. To the best of our knowledge, this article proposes for the first time a global network testing and configuration strategy that meets the opposite requirements by means of a fault-tolerant dual network architecture and a fast configuration algorithm for the most common failure patterns. Experimental results report an area overhead as low as 12.5% with respect to the baseline switch architecture while achieving a high degree of fault tolerance. In fact, even when multiple stuck-at faults are considered, the capability of fault masking by the dual network is always over 80%, and the support for multiple link failures is more than 90% in the presence of two unusable links in the main network with minimum set-up times.

    This work was supported by the NANOC European Project (FP7-ICT-248972) and by the HiPEAC Network of Excellence (Interconnect Cluster).

    Ghiribaldi, A.; Ludovici, D.; Triviño, F.; Strano, A.; Flich Cardo, J.; Sanchez Garcia, JL.; Alfaro, F.... (2013). A complete self-testing and self-configuring NoC infrastructure for cost-effective MPSoCs. ACM Transactions on Embedded Computing Systems. 12(4):106:1-106:29. https://doi.org/10.1145/2485984.2485994

    Augmenting manycore programmable accelerators with photonic interconnect technology for the high-end embedded computing domain

    There is today consensus on the fact that optical interconnects can relieve bandwidth density concerns at integrated circuit boundaries. However, when it comes to the extension of this emerging interconnect technology to on-chip communication as well, such consensus seems to fall apart. The main reason is a fundamental lack of compelling cases proving the superior performance and/or energy properties yielded by devices of practical interest, when re-architected around a photonically-integrated communication fabric. This paper starts from the consideration that manycore computing platforms are gaining momentum in the high-end embedded computing domain in the form of general-purpose programmable accelerators. Hence, the performance and energy implications of augmenting these devices with optical interconnect technology are derived by means of an accurate benchmarking framework against an aggressively optimized electrical counterpart